Cell Systems
Elsevier BV
All preprints, ranked by how well they match Cell Systems's content profile, based on 167 papers previously published here. The average preprint has a 0.56% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Govers, S. K.; Campos, M.; Tyagi, B.; Laloux, G.; Jacobs-Wagner, C.
To examine how bacteria achieve robust cell proliferation across diverse conditions, we developed a method that quantifies 77 cell morphological, cell cycle and growth phenotypes of a fluorescently-labeled Escherichia coli strain and >800 gene deletion derivatives under multiple nutrient conditions. This approach revealed extensive phenotypic plasticity and deviating mutant phenotypes were often found to be nutrient-dependent. From this broad phenotypic landscape emerged simple and robust unifying rules (laws) that connect DNA replication initiation, nucleoid segregation, FtsZ-ring formation, and cell constriction to specific aspects of cell size (volume, length, or added length). Furthermore, completion of cell division followed the initiation of cell constriction after a constant time delay across strains and nutrient conditions, identifying cell constriction as a key control point for cell size determination. Our work provides a systems-level understanding of the design principles by which E. coli integrates cell cycle processes and growth rate with cell size to achieve its robust proliferative capability.
Shoyer, T. C.; Di Ventura, B.
Transcription factors (TFs) respond to external stimuli with time-varying changes in activity or localization (TF dynamics), driving differential transcriptional programs. Previous studies indicated that TF dynamics can be decoded at the promoter level in eukaryotes, yet a systematic understanding of robust solutions is lacking. By computationally screening over 10,000 mathematical models of multi-state promoters with various forms of TF-mediated regulation, we identify robust configurations that selectively respond to sustained ("pulse filtering") or pulsatile ("pulse boosting") TF dynamics. Promoters that activate via intermediate states and have negatively regulated deactivation robustly perform pulse filtering. In contrast, robust pulse boosting is achieved by promoters with a TF-mediated refractory state that permits short activation and recovers between pulses. Bifunctional TFs that exert activator- and repressor-like regulation extend the design space for pulse boosting. These results reveal general principles by which promoters interpret TF dynamics and suggest strategies to engineer synthetic systems to exploit them.
Highlights
- Computational screen of over 10,000 promoter models identifies features that enable promoters to selectively respond to sustained ("pulse filtering") or pulsatile ("pulse boosting") transcription factor (TF) dynamics.
- Promoters that activate via intermediate states and have negatively regulated deactivation robustly perform pulse filtering.
- Promoters with TF-regulated refractoriness robustly perform pulse boosting.
- Promoters regulated by bifunctional TFs extend the design space for pulse boosting.
Shaikh, R.; Reeves, G. T.
The TGF-β/Smad signaling pathway regulates growth, development, and homeostasis of tissues across the animal kingdom. The pathway is activated when transforming growth factor-β (TGF-β) binds to its cognate transmembrane receptors to activate Smad2 by phosphorylation. The activated phospho-Smad2 (PSmad2) undergoes a series of biochemical interactions with transcriptional activator Smad4 to form several oligomers, including the transcription factor (PSmad2)₂/Smad4, which regulates target gene expression. Quantitative live cell imaging and mathematical modeling have been used to estimate the dynamics of (PSmad2)₂/Smad4. However, due to the emergent nature of Smad2-Smad4 interactions, deconvolving the dynamics of (PSmad2)₂/Smad4 is challenging. We show that the Smad model is sloppy, has large parameter uncertainties (on the order of 10¹⁵), and has Fisher Information Matrix eigenvalues that span several decades. As such, well-fit parameter sets generate highly under-constrained predictions. To overcome this, we employ Profile Log-Likelihood to guide maximally informed design of experiments (MIDOE) and infer the dynamics of (PSmad2)₂/Smad4. We generate these experiments computationally and validate that MIDOE can optimally constrain model predictions. We demonstrate that such careful analysis would not only improve the predictive power of models in systems biology but also reduce the time and expense of performing non-optimal experiments.
Bhamidipati, P. S.; Thomson, M.
Discovering biochemical circuits that exhibit a desired behavior is an outstanding problem in biological engineering. The traditional approach of enumerating every possible circuit topology becomes intractable for circuits with more than four components due to combinatorial scaling of the search space. Here, we use Monte Carlo Tree Search (MCTS), a reinforcement learning (RL) algorithm, to optimize circuit topology for a target phenotype by approaching circuit design as a sequence of assembly decisions. Our RL-based design framework, which we call CircuiTree, efficiently and comprehensively finds robust designs for three-component oscillators by prioritizing sparsity. CircuiTree can also infer candidate network motifs from its search results, producing similar results to enumeration. Using parallel MCTS, we scale this workflow up to five components and find that highly fault-tolerant designs use a novel strategy, which we call motif multiplexing. Multiplexed circuits contain many overlapping network motifs that each activate in different mutational scenarios. The evolutionary robustness of multiplexing may explain the ubiquity of multiple sub-oscillators in circadian clock circuits. Overall, CircuiTree provides the first scalable computational platform for designing biochemical circuits.
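As a rough illustration of the combinatorial scaling described above, the sketch below counts candidate topologies. It assumes, beyond what the abstract states, that every ordered pair of components (self-regulation included) is either activating, inhibiting, or absent, and it ignores equivalence between relabeled components. `count_topologies` is a hypothetical helper, not part of CircuiTree.

```python
def count_topologies(n_components: int) -> int:
    """Count signed, directed topologies under the assumed encoding.

    Assumption: each of the n * n ordered pairs (including self-loops)
    takes one of three states: activation, inhibition, or no interaction.
    """
    n_edges = n_components * n_components
    return 3 ** n_edges


# Print how quickly the search space grows with the number of components.
for n in range(2, 6):
    print(n, count_topologies(n))
```

Under this assumed encoding, three components give 19,683 topologies and five give about 8.5 × 10^11, which suggests why exhaustive enumeration stops being practical beyond a handful of components.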
Connors, B. M.; Ertmer, S.; Clark, R. L.; Thompson, J.; Pfleger, B. F.; Venturelli, O. S.
Microbial communities have tremendous potential as therapeutics. However, a major bottleneck is manufacturing high-diversity microbial communities with desired species compositions. We develop a two-stage, model-guided framework to produce microbial communities with target species compositions. We apply this method to optimize the diversity of a synthetic human gut community. The first stage exploits media components to enable uniform growth responses of individual species and the second stage uses a design-test-learn cycle with initial species abundance as a control point to manipulate community composition. Our designed culture conditions yield 91% of the maximum possible diversity. Leveraging these data, we construct a dynamic ecological model to guide the design of lower-order communities with desired temporal properties over a longer timescale. In sum, a deeper understanding of how microbial community assembly responds to changes in environmental factors, initial species abundances, and inter-species interactions can enable the predictable design of community dynamics.
Stindt, K. R.; McClean, M. N.
The ability to modify and control natural and engineered microbiomes is essential for biotechnology and biomedicine. Fungi are critical members of most microbiomes, yet technology for modifying the fungal members of a microbiome has lagged far behind that for bacteria. Interdomain conjugation (IDC) is a promising approach, as DNA transfer from bacterial cells to yeast enables in situ modification. While such genetic transfers have been known to naturally occur in a wide range of eukaryotes, and are thought to contribute to their evolution, IDC has been understudied as a technique to control fungal or fungal-bacterial consortia. One major obstacle to widespread use of IDC is its limited efficiency. In this work, we utilize interactions between genetically tractable Escherichia coli and Saccharomyces cerevisiae to control the incidence of IDC. We test the landscape of population interactions between the bacterial donors and yeast recipients to find that bacterial commensalism leads to maximized IDC, both in culture and in mixed colonies. We demonstrate the capacity of cell-to-cell binding via mannoproteins to assist both IDC incidence and bacterial commensalism in culture, and model how these tunable controls can predictably yield a range of IDC outcomes. Further, we demonstrate that these lessons can be utilized to lastingly alter a recipient yeast population, by both "rescuing" a poor-growing recipient population and collapsing a stable population via a novel IDC-mediated CRISPR/Cas9 system.
Kim, E.; Gheorghe, V.; Hart, T.
Coessentiality networks derived from CRISPR screens in cell lines provide a powerful framework for identifying functional modules in the cell and for inferring the role of uncharacterized genes. However, these networks integrate signal across all underlying data, and can mask strong interactions that occur in only a subset of the cell lines analyzed. Here we decipher dynamic functional interactions by identifying significant cellular contexts, primarily by oncogenic mutation, lineage, and tumor type, and discovering coessentiality relationships that depend on these contexts. We recapitulate well-known gene-context interactions such as oncogene-mutation, paralog buffering, and tissue-specific essential genes, show how mutation rewires known signal transduction pathways, including RAS/RAF and IGF1R-PIK3CA, and illustrate the implications for drug targeting. We further demonstrate how context-dependent functional interactions can elucidate lineage-specific gene function, as illustrated by the maturation of proreceptors IGF1R and MET by proteases FURIN and CPD. This approach advances our understanding of context-dependent interactions and how they can be gleaned from these data. We provide an online resource to explore these context-dependent interactions at diffnet.hart-lab.org.
Ahmadi, S.; Sukprasert, P.; Artzi, N.; Khuller, S.; Schaffer, A. A.; Ruppin, E.
The availability of single-cell transcriptomics data opens new opportunities for rational design of combination cancer treatments. Mining such data, we employed combinatorial optimization techniques to explore the landscape of optimal combination therapies in solid tumors including brain, head and neck, melanoma, lung, breast and colon cancers. We assume that each individual therapy can target any one of 1269 genes encoding cell surface receptors, which may be targets of CAR-T, conjugated antibodies or coated nanoparticle therapies. As a baseline case, we studied the killing of at least 80% of the tumor cells while sparing more than 90% of the non-tumor cells in each patient, as a putative regimen. We find that in most cancer types, personalized combinations composed of at most four targets are then sufficient. However, the number of distinct targets that one would need to assemble to treat all patients in a cohort accordingly would be around 10 in most cases. Further requiring that the target genes also be lowly expressed in healthy tissues uncovers qualitatively similar trends. However, as one asks for more stringent and selective killing beyond the baseline regimen we focused on, we find that the number of targets needed rises rapidly. Emerging individual promising receptor targets include PTPRZ1, which is frequently found in the optimal combinations for brain and head and neck cancers, and EGFR, a recurring target in multiple tumor types. In sum, this systematic single-cell-based characterization of the landscape of combinatorial receptor-mediated cancer treatments establishes first-of-their-kind estimates of the number of targets needed, identifying promising ones for future development.
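The baseline criterion above (kill at least 80% of tumor cells, spare more than 90% of non-tumor cells, with at most four targets) can be sketched as a small brute-force search. This is an illustrative toy, not the authors' optimization method: the helper names, the rule that a cell is killed if it expresses any chosen target, and the expression data are all assumptions.

```python
from fractions import Fraction
from itertools import combinations


def fraction_hit(cells, targets):
    """Exact fraction of cells that express at least one chosen target."""
    hit = sum(1 for expressed in cells if expressed & targets)
    return Fraction(hit, len(cells))


def smallest_combination(tumor, normal, genes,
                         kill=Fraction(8, 10), spare=Fraction(9, 10),
                         max_size=4):
    """Return the first smallest target set meeting both thresholds, or None."""
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(genes), size):
            chosen = set(combo)
            killed = fraction_hit(tumor, chosen)
            spared = 1 - fraction_hit(normal, chosen)
            # "At least 80%" killed and strictly "more than 90%" spared.
            if killed >= kill and spared > spare:
                return chosen
    return None


# Hypothetical toy data: each cell is the set of receptor genes it expresses.
tumor_cells = [{"EGFR"}, {"EGFR"}, {"PTPRZ1"}, {"PTPRZ1", "MET"}, {"MET"}]
normal_cells = [set() for _ in range(9)] + [{"MET"}]

best = smallest_combination(tumor_cells, normal_cells, {"EGFR", "MET", "PTPRZ1"})
print(best)
```

In this toy data no single target reaches the kill threshold, and any pair containing MET spares exactly 90% of normal cells, which fails the strict "more than 90%" condition. Exact fractions are used so that threshold comparisons are not affected by floating-point rounding.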
Guharajan, S.; Parisutham, V.; Brewster, R. C.
Transcription factors (TFs) are often classified as activators or repressors, yet these context-dependent labels are inadequate to predict quantitative profiles that emerge across different promoters. The regulatory interplay between a TF's function and promoter features can be complex due to the lack of systematic genetic control in the natural cellular environment. To address this, we use a library of E. coli strains with precise control of TF copy number. We measure the quantitative regulatory input-output function of 90 TFs on synthetic promoters that isolate the contributions of TF binding sequence, location, and basal promoter strength to gene expression, uncovering TF-specific regulatory principles. We infer that many of these TFs function by stabilizing RNA polymerase at the promoter, a property we see for both activating and repressing TFs. We develop a thermodynamic model that predicts stabilizing TFs have a specific quantitative relationship with promoters of differential strength. We test this prediction using synthetic promoters spanning over a 100-fold range in basal expression levels and confirm that stronger promoters have lower fold-change for stabilizing TFs, whereas non-stabilizing TFs do not exhibit this relationship, indicating a conserved mechanism of transcription control across distinct TFs. This work demonstrates that understanding the intrinsic mechanisms of TF function is central to decoding the relationship between sequence and gene expression.
Suzuki, S. K.; Errede, B.; Dohlman, H. G.; Elston, T. C.
Cells rely on mitogen-activated protein kinases (MAPKs) to survive environmental stress. In yeast, activation of the MAPK Hog1 is known to mediate the response to high osmotic conditions. Recent studies of Hog1 revealed that its temporal activity is subject to both negative and positive feedback regulation, yet the mechanisms of feedback remain unclear. By designing mathematical models of increasing complexity for the Hog1 MAPK cascade, we identified pathway circuitry sufficient to capture Hog1 dynamics observed in vivo. We used these models to optimize experimental designs for distinguishing potential feedback loops. Performing experiments based on these models revealed mutual inhibition between Hog1 and its phosphatases as the likely positive feedback mechanism underlying switch-like, dose-dependent MAPK activation. Importantly, our findings reveal a new signaling function for MAPK phosphatases. More broadly, they demonstrate the value of using mathematical models to infer targets of feedback regulation in signaling pathways.
Lopez-Malo, M.; Maerkl, S. J.
Transcription factors (TFs) regulate gene expression by binding cis-regulatory DNA elements, yet how trans-regulatory characteristics such as TF affinity, concentration, and localization interact with cis-regulatory elements remains largely unclear. We systematically analyzed TF affinity mutants across abundance and localization states and found that promoter binding-site strength most readily modulated expression levels, followed by TF localization and concentration, while affinity variations were mainly buffered. We further uncover performance trade-offs between TF abundance, localization, and affinity. Together, these results reveal how trans and cis factors collectively shape gene-regulatory output.
Su, Y.; Li, G.; Ko, M. E.; Cheng, H.; Zhu, R.; Xue, M.; Wang, J.; Lee, J. W.; Frankiw, L.; Xu, A.; Wong, S.; Robert, L.; Takata, K.; Huang, S.; Ribas, A.; Levine, R.; Nolan, G. P.; Wei, W.; Plevritis, S. K.; Baltimore, D.; Heath, J. R.
The determination of individual cell trajectories through a high-dimensional cell-state space is an outstanding challenge, with relevance towards understanding biological changes ranging from cellular differentiation to epigenetic (adaptive) responses of diseased cells to drugging. We report on a combined experimental and theoretical method for determining the trajectories that specific highly plastic BRAF V600E-mutant, patient-derived melanoma cancer cells take between drug-naive and drug-tolerant states. Recent studies have implicated non-genetic, fast-acting resistance mechanisms that are activated in these cells following BRAF inhibition. While single-cell highly multiplex omics tools can yield snapshots of the cell state space landscape sampled at any given time point, individual cell trajectories must be inferred from a kinetic series of snapshots, and that inference can be confounded by stochastic cell state switching. Using a microfluidic-based single-cell integrated proteomic and metabolic assay, we assayed for a panel of signaling, phenotypic, and metabolic regulators at four time points during the first five days of drug treatment. Dimensional reduction of the resultant data set, coupled with information-theoretic analysis, uncovered a complex cell state landscape and identified two distinct paths connecting drug-naive and drug-tolerant states. Cells are shown to exclusively traverse one of the two pathways depending on the level of the lineage-restricted transcription factor MITF in the drug-naive cells. The two trajectories are associated with distinct signaling and metabolic susceptibilities, and are independently druggable. Our results update the paradigm of adaptive resistance development in an isogenic cell population and offer insight into the design of more effective combination therapies.
Wang, B.; Thachuk, C.; Soloveichik, D.
Molecular control circuits embedded within chemical systems to direct molecular events have transformative applications in synthetic biology, medicine, and other fields. However, it is challenging to understand the collective behavior of components due to the combinatorial complexity of possible interactions. Some of the largest engineered molecular systems to date have been constructed from DNA strand displacement reactions, in which signals can be propagated without a net change in base pairs. For linear chains of such enthalpy-neutral displacement reactions, we develop a rigorous framework to reason about interactions between regions that must be complementary. We then analyze desired and undesired properties affecting speed and correctness of such systems, including the spurious release of output (leak) and reversible unproductive binding (toehold occlusion), and experimentally confirm the predictions. Our approach, analogous to the rigorous proofs of algorithm correctness in computer science, can guide engineering of robust and efficient molecular algorithms.
Weinstein, E. N.; Gollub, M. G.; Slabodkin, A.; Gardner, C. L.; Dobbs, K.; Cui, X.-B.; Amin, A. N.; Church, G. M.; Wood, E. B.
We introduce a method to reduce the cost of synthesizing proteins and other biological sequences designed by a generative model by as much as a trillion-fold. In particular, we make our generative models manufacturing-aware, such that model-designed sequences can be efficiently synthesized in the real world with extreme parallelism. We demonstrate by training and synthesizing samples from generative models of antibodies, T cell antigens and DNA polymerases. For example, we train a manufacturing-aware generative model on 300 million observed human antibodies and synthesize ~10¹⁷ generated designs from the model, achieving a sample quality comparable to a state-of-the-art protein language model, at a cost of 10³ dollars. Using previous methods, synthesis of a library of the same accuracy and size would cost roughly a quadrillion (10¹⁵) dollars.
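The cost figures above can be checked with simple arithmetic. The sketch below assumes the abstract's numbers are powers of ten (10^3 dollars for the new method, 10^15 dollars for previous methods, and about 10^17 synthesized designs); the variable names are illustrative only.

```python
from fractions import Fraction

# Figures as read from the abstract (assumed to be powers of ten).
previous_cost_usd = 10**15   # "roughly a quadrillion dollars"
new_cost_usd = 10**3         # reported cost of the manufacturing-aware approach
n_designs = 10**17           # approximate number of synthesized designs

# Ratio of the two costs: the claimed "trillion-fold" reduction.
fold_reduction = previous_cost_usd // new_cost_usd

# Implied cost per synthesized design under the new approach, kept exact.
cost_per_design_usd = Fraction(new_cost_usd, n_designs)

print(fold_reduction)
print(float(cost_per_design_usd))
```

The ratio works out to 10^12, consistent with the "trillion-fold" claim, and the implied cost per design is 10^-14 dollars.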
Shimpi, A. A.; Naegle, K. M.
Evolution has developed a set of principles that determine feasible domain combinations, analogous to grammar within natural languages. Treating domains as words and proteins as sentences made up of domain words, we apply a linguistic approach to represent the human proteome as an n-gram network, which we hereafter call Domain Architecture Network Syntax (DANSy). Combining DANSy with network theory, we explore the abstract rules of domain word combinations within the human proteome and identify connections that determine feasible protein functionality. We analyze the entropic information content of these domain word connections to establish a DANSy network that balances recovering most of the proteome with n-gram complexity. Additionally, we explored subnetwork languages by focusing on reversible post-translational modification (PTM) systems that follow a reader-writer-eraser paradigm. We find that PTM systems appear to sample grammar rules near the onset of the system expansion, but then converge towards similar grammar rules, which stabilize during the post-metazoan switch. For example, reader and writer domains are typically tightly connected through shared n-grams, but eraser domains are almost always loosely or completely disconnected from readers and writers. Additionally, after grammar fixation, domains with verb-like properties, such as writers and erasers, never appear together - consistent with the idea of natural grammar that leads to clarity and limits futile enzymatic cycles. Given how some cancer fusion genes represent the possibility for the emergence of novel language, we investigate how cancer fusion genes alter the human proteome n-gram network. We find most cancer fusion genes follow existing grammar rules. Finally, we adapt our DANSy analysis for differential expression (deDANSy) analysis to determine the relationship of coordinated changes in domain language syntax to cell phenotypes. We applied deDANSy to RNA-sequencing data from SOX10-deficient melanoma cells, finding that we can use network separation and syntax enrichment to characterize the molecular basis of cell phenotypes and identify novel information distinct from gene set enrichment analysis (GSEA) approaches. Collectively, these results suggest that n-gram-based analysis of proteomes is a complement to direct protein interaction approaches, is more fully described than protein-protein interaction networks, and can be used to provide unique insights for signaling pathway enrichment analysis.
Visani, G. M.; Verma, A.; DeWitt, W. S.
Recent work from Tran et al. (Science, 2026) introduced MULTI-evolve, a framework for protein engineering that combines single-mutant nomination via a protein language model (PLM) or a deep mutational scan (DMS), experimental single- and double-mutant characterization, and neural networks to engineer hyperactive multimutant proteins. The authors attribute the framework's performance to "epistasis-aware modeling" and claim that their neural networks "learn the epistatic landscape" and "identify synergistic interactions" from limited double-mutant training data. Additive models, by definition, cannot represent epistasis, making them a natural null baseline for such claims. Here we show that MULTI-evolve's multimutant predictions are almost perfectly correlated with those of an additive model across all three engineering applications (APEX, dCasRx, and HuABC2), such that the engineering of multimutants reduces to combining beneficial mutations with the largest additive effects, a standard protein engineering strategy for over four decades. We also find that MULTI-evolve's neural networks do not outperform an additive model on held-out test set predictions, and do not even represent epistasis in their training data. Finally, we revisit a DMS benchmark finding presented as evidence of epistasis learning and show that the same pattern is expected even under a null additive model, due to an elementary statistical phenomenon; when we fit an additive model to the benchmark data, it reproduces the reported pattern. More broadly, our findings underscore the need to benchmark models for machine learning-guided directed evolution against additive null baselines before attributing performance to learned epistasis.
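To make the additive null baseline concrete, the sketch below predicts a multimutant's effect as the sum of its single-mutant effects and defines epistasis as the observed deviation from that sum. It is a minimal illustration, not the analysis from the preprint: the function names, the mutation labels, and the effect values (taken to be on a log scale relative to wild type) are hypothetical.

```python
def additive_prediction(single_effects, mutations):
    """Predict a multimutant effect as the sum of single-mutant effects."""
    return sum(single_effects[m] for m in mutations)


def epistasis(observed, single_effects, mutations):
    """Deviation of the observed effect from the additive prediction."""
    return observed - additive_prediction(single_effects, mutations)


# Hypothetical single-mutant effects (log scale, relative to wild type).
single_effects = {"A10G": 0.5, "L42F": 0.25, "K77R": -0.125}

double_mutant = ["A10G", "L42F"]
predicted = additive_prediction(single_effects, double_mutant)

# Hypothetical observed effect for the same double mutant.
deviation = epistasis(1.0, single_effects, double_mutant)

print(predicted, deviation)
```

A model whose predictions track `additive_prediction` closely has, by construction, little room to be capturing epistasis, which is the kind of comparison the abstract argues should be reported.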
Jones, A.; Tsherniak, A.; McFarland, J.
While chemical and genetic viability screens in cancer cell lines have identified many promising cancer vulnerabilities, simple univariate readouts of cell proliferation fail to capture the complex cellular responses to perturbations. Complementarily, gene expression profiling offers an information-rich measure of cell state that can provide a more detailed account of cellular responses to perturbations. Relatively little is known, however, about the relationship between transcriptional responses to perturbations and the long-term cell viability effects of those perturbations. To address this question, we integrated thousands of post-perturbational transcriptional profiles from the Connectivity Map with large-scale screens of cancer cell line viability responses to genetic and chemical perturbations. This analysis revealed a generalized transcriptional signature associated with reduced viability across perturbations, which was consistent across post-perturbation time-points, perturbation types, and viability datasets. At a more granular level, we lay out the landscape of treatment-specific expression-viability relationships across a broad panel of drugs and genetic reagents, and we demonstrate that these post-perturbational expression signatures can be used to infer long-term viability. Together, these results help unmask the transcriptional changes that are associated with perturbation-induced viability loss in cancer cell lines.
Müller, S.; Predl, M.; Szeliova, D.; Zanghellini, J.
Bridging the gap between mechanistic models of metabolism and ecological theory remains a key challenge in understanding interactions within microbial communities. We propose a geometric framework for analyzing community metabolism, based on constraint-based modeling. By extending pathway analysis methods from single organisms to multi-species systems, we define the community metabolic space as the set of all feasible fluxes between species and their environment, conditional on growth rate and medium composition. Embedded within a nonlinear geometry, this space forms a polytope whose vertices represent the minimal building blocks of community metabolism from which every feasible solution can be constructed. Strikingly, these elementary community flux modes invite direct ecological interpretation - as specialist, commensalist, or mutualist modes of cooperation. Furthermore, we find that mutualism is either isotypic (arising from a minimal mutualistic behavior) or anisotypic (emerging from a combination of reciprocal commensalists). This distinction demonstrates that bidirectional cross-feeding alone is insufficient to determine the ecological interaction type. Our framework also offers significant potential for applications. Because it does not rely on optimization, powerful unbiased tools from metabolic engineering, such as production envelopes and minimal cut sets, may be extended to microbial communities. Taken together, this perspective aims to unify ecological and metabolic viewpoints by linking interaction types to the geometric structure of the community metabolic space, thereby laying the foundation for a deeper understanding of community structure, function, and design.
Heidenreich, M.; Mathur, S.; Shu, T.; Xie, Y.; Sriker, D.; Dubreuil, B.; Holt, L. J.; Levy, E. D.
Biomolecular organization is central to cell function. While phase separation is a key mechanism orchestrating this organization, we lack a comprehensive view of genes that can globally influence this process in vivo. To identify such genes, we combined functional genomics and synthetic biology. We developed a bioorthogonal system that can identify changes in the intracellular milieu that globally tune phase separation. We measured in vivo phase diagrams of a synthetic system across >25 million cells in 2,888 yeast knockouts, and identified 68 genes whose deletion alters the phase boundaries of the synthetic system, an unexpected result given the system's bioorthogonal design. Genes involved in TORC1 signaling and metabolism, particularly carbohydrate, amino acid, and nucleotide synthesis, were enriched. The mutants that changed phase separation also showed high pleiotropy, suggesting that phase separation interrelates with many aspects of biology.
Highlights
- A synthetic protein system reveals the genetic and environmental tunability of protein phase separation
- Genetic knockouts affecting phase separation are highly pleiotropic
- Carbohydrate, amino acid, and nucleotide metabolism contribute to modulating phase separation potential
- Protein phase separation is a globally tunable property of the intracellular environment
Loman, T. E.; Schwall, C. P.; Saez, T.; Liu, Y.; Locke, J.
Genetic circuits with only a few components can generate complex gene regulatory dynamics. Here, we combine stochastic modelling and single-cell time-lapse microscopy to reveal the possible dynamics generated by a key gene circuit motif: the mixed positive/negative feedback loop. Our minimal stochastic model of this motif reveals ten distinct classes of dynamic output, including stochastic pulsing, oscillations, and bistability. We systematically map how the circuit's core parameters can be tuned to generate each of the behaviours. Experimental validation in two different mixed feedback circuits in the bacterium Bacillus subtilis, σB and σV, confirms our model's predictive power. Guided by our simulations, we are able to transition between dynamic behaviours by modulating in vivo parameters. Together, these results demonstrate how mixed feedback loops generate diverse single-cell dynamics, improving our understanding of this common biological network motif and informing our efforts to engineer them for synthetic biology applications.